Denoising Distantly Supervised Named Entity Recognition via a Hypergeometric Probabilistic Model

Abstract

Denoising is the essential step for distant supervision based named entity recognition. Previous denoising methods are mostly based on instance-level confidence statistics, which ignore the variety of the underlying noise distribution on different datasets and entity types. This makes them difficult to adapt to high noise rate settings. In this paper, we propose Hypergeometric Learning (HGL), a denoising algorithm for distantly supervised NER that takes both noise distribution and instance-level confidence into consideration. Specifically, during neural network training, we naturally model the noise samples in each batch as following a hypergeometric distribution parameterized by the noise rate. Then each instance in the batch is regarded as either a correct or a noisy one according to its label confidence derived from the previous training step, as well as the noise distribution in this sampled batch. Experiments show that HGL can effectively denoise the weakly-labeled data retrieved from distant supervision, and therefore results in significant improvements on the trained models.
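
As a rough illustration of the batch-level idea described in the abstract, the Python sketch below draws the number of noisy instances in a batch from a hypergeometric distribution and then flags the lowest-confidence instances as noisy. This is not the authors' implementation: the function name `flag_noisy_instances`, the parameters `dataset_size` and `noise_rate`, and the rule of flagging the lowest-confidence instances first are all assumptions made here for illustration.

```python
import numpy as np


def flag_noisy_instances(confidences, dataset_size, noise_rate, rng):
    """Return a boolean mask marking the instances treated as noisy.

    Hypothetical helper, not taken from the paper. It assumes the batch is
    drawn without replacement from `dataset_size` instances, of which a
    fraction `noise_rate` carry a wrong distant-supervision label.
    """
    confidences = np.asarray(confidences, dtype=float)
    batch_size = confidences.shape[0]
    if batch_size > dataset_size:
        raise ValueError("batch cannot be larger than the dataset")

    # Split the whole dataset into (assumed) noisy and clean counts.
    n_noisy_total = int(round(noise_rate * dataset_size))
    n_clean_total = dataset_size - n_noisy_total

    # Number of clean instances in the batch follows a hypergeometric law.
    n_clean_in_batch = int(
        rng.hypergeometric(n_clean_total, n_noisy_total, batch_size)
    )
    n_noisy_in_batch = batch_size - n_clean_in_batch

    # Assumption: the least confident labels are the ones treated as noisy.
    order = np.argsort(confidences)  # ascending confidence
    is_noisy = np.zeros(batch_size, dtype=bool)
    is_noisy[order[:n_noisy_in_batch]] = True
    return is_noisy


rng = np.random.default_rng(0)
batch_confidences = [0.9, 0.1, 0.8, 0.3, 0.7, 0.95, 0.2, 0.6]
mask = flag_noisy_instances(
    batch_confidences, dataset_size=1000, noise_rate=0.3, rng=rng
)
print(mask)
```

In a training loop, a mask like this could be used to exclude or down-weight the flagged instances when computing the loss. How HGL actually uses the noisy/correct decision is not specified in the abstract.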

Similar resources

Supervised Named Entity Recognition for Clinical Data

Clinical Named Entity Recognition is part of Task 1b, organised by the CLEF eHealth organisation in 2015. The aim is to automatically identify clinically relevant entities in French medical text. A supervised learning approach was used to train the tagger. Conditional Random Fields (CRF) were used for training, with an extensive set of features. Pre...

A Semi-supervised Learning Approach to Arabic Named Entity Recognition

We present ASemiNER, a semi-supervised algorithm for identifying Named Entities (NEs) in Arabic text. ASemiNER does not require annotated training data or gazetteers. It can also be easily adapted to handle more than the three standard NE types (Person, Location, and Organisation). To our knowledge, our algorithm is the first study that intensively investigates the semi-supervised pattern-based...

A Simple Semi-supervised Algorithm For Named Entity Recognition

We present a simple semi-supervised learning algorithm for named entity recognition (NER) using conditional random fields (CRFs). The algorithm is based on exploiting evidence that is independent from the features used for a classifier, which provides high-precision labels to unlabeled data. Such independent evidence is used to automatically extract high-accuracy and non-redundant data, leading ...

Named Entity Recognition in Persian Text using Deep Learning

Named entity recognition is a fundamental task in the field of natural language processing. It is also known as a subtask of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...

Named Entity Chunking Techniques in Supervised Learning for Japanese Named Entity Recognition

This paper focuses on the issue of named entity chunking in Japanese named entity recognition. We apply the supervised decision list learning method to Japanese named entity recognition. We also investigate and incorporate several named-entity noun phrase chunking techniques and experimentally evaluate and compare their performance. In addition, we propose a method for incorporati...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v35i16.17702